Reinforcement Learning for on-line Sequence Transformation

Similar resources

On-Line EM Reinforcement Learning

In this article, we propose a new reinforcement learning (RL) method for a system having continuous state and action spaces. Our RL method has an architecture like the actor-critic model. The critic tries to approximate the Q-function, which is the expected future return for the current state-action pair. The actor tries to approximate a stochastic soft-max policy defined by the Q-function. The ...

Tree-Based On-Line Reinforcement Learning

Fitted Q-iteration (FQI) stands out among reinforcement learning algorithms for its flexibility and ease of use. FQI can be combined with any regression method, and this choice determines the algorithm’s statistical and computational properties. The combination of FQI with an ensemble of regression trees gives rise to an algorithm, FQIT, that is computationally efficient, scalable to high dimen...

Sparse Distributed Memories for On-Line Value-Based Reinforcement Learning

In this paper, we advocate the use of Sparse Distributed Memories (SDMs) for on-line, value-based reinforcement learning (RL). SDMs provide a linear, local function approximation scheme, designed to work when a very large, high-dimensional input (address) space has to be mapped into a much smaller physical memory. We present an implementation of the SDM architecture for on-line, value-based RL ...

Modular on-line function approximation for scaling up reinforcement learning

Reinforcement learning is a powerful learning paradigm for autonomous agents which interact with unknown environments with the objective of maximizing cumulative payoff. Recent research has addressed issues concerning the scaling up of reinforcement learning methods in order to solve problems with large state spaces, composite tasks and tasks involving non-Markovian situations. In ...

Expected Mistake Bound Model for On-Line Reinforcement Learning

We propose a model of efficient on-line reinforcement learning based on the expected mistake bound framework introduced by Haussler, Littlestone and Warmuth (1987). The measure of performance we use is the expected difference between the total reward received by the learning agent and that received by an agent behaving optimally from the start. We call this expected difference the cumulative mistak...

Journal

Journal title: Computer Science and Information Systems (FedCSIS), 2019 Federated Conference on

Year: 2022

ISSN: 2300-5963

DOI: https://doi.org/10.15439/2022f70